We use the pulmonary model from the SECRET competition.
Load in the design and the simulator output:
library(R.matlab) # provides readMat, used below
design <- readRDS('data/designW1.rds')
t <- 512 # number of timepoints
v <- 3 # number of variables
n <- 100 # number of simulations
all_data <- array(0, dim = c(t,v,n))
for (i in 1:n){
  tmp <- readMat(paste0('data/output/flow', i, '.mat'))[[1]]
  all_data[,,i] <- tmp
}
Plot all data:
library(ggplot2)
library(reshape2) # provides melt
plot_data <- data.frame(Time = 1:t,
                        Run = rep(1:n, each = t*v),
                        Output = c(all_data),
                        Type = rep(c('Flow1', 'Flow2', 'Pressure'), each = t))
plot_data2 <- melt(plot_data, id.vars = c('Time', 'Run', 'Type'))
ggplot(plot_data2, aes(x = Time, y = value, col = as.factor(Run))) +
  geom_line() +
  facet_wrap(vars(Type), nrow = 2, scales = 'free_y') +
  theme(legend.position = 'none')
Let’s select timepoint 200 from Flow1 as our output, creating a data frame and plotting it against one of the inputs:
design$ID <- NULL # don't need this column
tData <- data.frame(design[,1:4], y = all_data[200,1,])
head(tData)
## kMV alpha lrrA lrrV y
## 1 151891.4 0.8349728 38.71343 29.62342 12.70690
## 2 292699.7 0.8340668 31.45554 40.85423 16.30124
## 3 218213.8 0.8719307 21.63183 45.98160 18.45270
## 4 122811.4 0.8310381 29.49230 35.13491 12.78105
## 5 223381.2 0.8681403 36.56351 24.98882 17.55068
## 6 169484.9 0.8741946 41.77370 31.13585 13.78499
ggplot(tData, aes(kMV, y)) + geom_point()
The inputs here are on very different scales. We should standardise these prior to modelling. Here, we have a fixed range for each input, so we use the min/max for each to scale these to [-1,1]:
tData$kMV <- (tData$kMV - 9*10^4) / ((3*10^5 - 9*10^4)/2) - 1
tData$alpha <- (tData$alpha - 0.83) / ((0.89 - 0.83)/2) - 1
tData$lrrA <- (tData$lrrA - 20) / ((50 - 20)/2) - 1
tData$lrrV <- (tData$lrrV - 20) / ((50 - 20)/2) - 1
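Each line above is the same affine map from a known input range [lo, hi] to [-1, 1], so it can be wrapped in a small helper (a hypothetical convenience function, not part of Gasp.R):

```r
# Hypothetical helper: affine map from [lo, hi] to [-1, 1]
scale_to_pm1 <- function(x, lo, hi) 2 * (x - lo) / (hi - lo) - 1

# e.g. the lrrA scaling above is equivalent to:
# tData$lrrA <- scale_to_pm1(tData$lrrA, 20, 50)
```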
summary(tData)
## kMV alpha lrrA
## Min. :-0.9950701 Min. :-0.9808387 Min. :-0.9966622
## 1st Qu.:-0.5008905 1st Qu.:-0.5010414 1st Qu.:-0.4937711
## Median :-0.0036676 Median :-0.0004437 Median : 0.0001145
## Mean : 0.0005379 Mean : 0.0000804 Mean :-0.0003623
## 3rd Qu.: 0.5038140 3rd Qu.: 0.4911227 3rd Qu.: 0.5017052
## Max. : 0.9856529 Max. : 0.9953333 Max. : 0.9858235
## lrrV y
## Min. :-0.9977574 Min. : 9.342
## 1st Qu.:-0.4922606 1st Qu.:13.197
## Median :-0.0033427 Median :15.172
## Mean :-0.0007766 Mean :15.381
## 3rd Qu.: 0.4986679 3rd Qu.:17.170
## Max. : 0.9948385 Max. :25.910
We probably want to split into training/test sets at this point. We could do this manually, or handle it internally in the emulation code.
To use this code, there’s one more thing we need to do. From the notes in Gasp.R, the input data must have the structure [design, Noise, output]. This is because a noise term is used in the selection of a mean function:
tData <- data.frame(tData[,1:4],
                    Noise = runif(nrow(tData), -1, 1),
                    y = tData$y)
Now we can run the code (setting a seed for reproducibility, because this function will randomly split the data into training/validation):
set.seed(2101)
em1 <- BuildGasp('y', tData)
## The upper bounds of the range parameters are 117.4767 120.7062 121.1018 122.6679 Inf
## The initial values of range parameters are 2.349534 2.414125 2.422036 2.453357
## Start of the optimization 1 :
## The number of iterations is 19
## The value of the marginal posterior function is -108.855
## Optimized range parameters are 1.637116 2.617475 3.509501 3.322867
## Optimized nugget parameter is 0.0005770015
## Convergence: TRUE
## The initial values of range parameters are 1.762785 1.811246 1.817181 1.840681
## Start of the optimization 2 :
## The number of iterations is 22
## The value of the marginal posterior function is -108.855
## Optimized range parameters are 1.637116 2.617475 3.509501 3.322867
## Optimized nugget parameter is 0.0005770015
## Convergence: TRUE
summary(em1)
## Length Class Mode
## em 1 rgasp S4
## type 1 -none- character
## mean_fn 0 -none- NULL
## train_data 5 data.frame list
## validation_data 5 data.frame list
This has stored an rgasp emulator under $em, what mean function was used (here none, hence NULL), and also what the training data and validation data were. If you look at Gasp.R or the raw code for BuildGasp, you’ll see there’s an input training_prop, which will split the data into training and validation sets, with a default of 75% used for training.
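With the default training_prop = 0.75, the split is presumably a simple random partition, something like the following sketch (an assumption about the internals, not the actual BuildGasp code):

```r
# Sketch of a 75/25 random split (assumed behaviour of training_prop = 0.75)
split_data <- function(data, training_prop = 0.75) {
  n_train <- floor(training_prop * nrow(data))
  idx <- sample(seq_len(nrow(data)), n_train)
  list(train = data[idx, , drop = FALSE],
       validation = data[-idx, , drop = FALSE])
}
```

For our 100 runs this gives 75 training and 25 validation points, consistent with the validation plots below.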
We can validate in several ways:
par(mar = c(4,2,2,2));ValidateGasp(em1)
This function has several options. If you provide it with only an emulator, it will predict over the validation data stored in the BuildGasp object. You can alternatively provide it with a new dataset. You can also get it to plot the predictions against the inputs:
par(mfrow = c(2,3), mar = c(4,2,2,2));ValidateGasp(em1, IndivPars = TRUE)
This emulator might not be suitable: these are 95% prediction intervals, and we have 5/25 points coloured red, i.e. 20% of our predictions are outside these intervals.
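The red/green colouring corresponds to an empirical coverage check, which we could also compute directly (a sketch; the actual plotting code may differ):

```r
# Proportion of points whose true value falls inside its prediction interval;
# for well-calibrated 95% intervals we expect this to be close to 0.95
coverage <- function(truth, lower, upper) mean(truth >= lower & truth <= upper)

# Here, 20 of 25 validation points inside gives coverage of 0.8
```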
Alternatively, we can do leave-one-out across the training data:
par(mar = c(4,2,2,2));LeaveOneOut(em1)
We’re doing a bit better here: around 5% of points fall outside the 95% intervals.
We could try something more complicated for the mean function. We could just use a linear mean:
set.seed(5820)
em2 <- BuildGasp('y', tData, mean_fn = 'linear')
## The upper bounds of the range parameters are 104.4217 103.118 103.1081 105.4503 Inf
## The initial values of range parameters are 2.088434 2.062359 2.062162 2.109005
## Start of the optimization 1 :
## The number of iterations is 22
## The value of the marginal posterior function is -92.54723
## Optimized range parameters are 1.438671 1.37506 1.529972 1.708329
## Optimized nugget parameter is 0.0007741812
## Convergence: TRUE
## The initial values of range parameters are 1.822727 1.79997 1.799798 1.840681
## Start of the optimization 2 :
## The number of iterations is 21
## The value of the marginal posterior function is -92.54723
## Optimized range parameters are 1.438671 1.37506 1.529972 1.708329
## Optimized nugget parameter is 0.0007741812
## Convergence: TRUE
par(mfrow=c(1,2),mar = c(4,2,2,2));ValidateGasp(em2);LeaveOneOut(em2)
Or we could fit something more general:
set.seed(3100329)
em3 <- BuildGasp('y', tData, mean_fn = 'step')
par(mfrow=c(1,2),mar = c(4,2,2,2));ValidateGasp(em3);LeaveOneOut(em3)
This final emulator appears to validate better than the others. Here, we’ve fitted a more complicated mean surface initially - we have some new entries in our emulator object:
summary(em3)
## Length Class Mode
## em 1 rgasp S4
## lm 11 -none- list
## active 4 -none- character
## type 1 -none- character
## mean_fn 1 -none- character
## train_data 5 data.frame list
## validation_data 5 data.frame list
$lm is a fitted linear regression object, giving this more complex mean structure:
summary(em3$lm$linModel)
##
## Call:
## lm(formula = y ~ lrrA + I(lrrA^2) + kMV + lrrV + alpha + I(kMV *
## lrrA) + I(lrrV * lrrA) + I(lrrV * kMV), data = tData)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.32718 -0.47132 0.08911 0.41029 1.72194
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15.1490 0.1118 135.527 < 2e-16 ***
## lrrA -3.4397 0.1323 -26.008 < 2e-16 ***
## I(lrrA^2) 0.8757 0.2647 3.309 0.001521 **
## kMV 2.8495 0.1253 22.744 < 2e-16 ***
## lrrV -2.0387 0.1328 -15.348 < 2e-16 ***
## alpha 1.9137 0.1305 14.661 < 2e-16 ***
## I(kMV * lrrA) -1.2418 0.2241 -5.541 5.65e-07 ***
## I(lrrV * lrrA) 0.9021 0.2256 3.999 0.000163 ***
## I(lrrV * kMV) -0.6761 0.2214 -3.054 0.003252 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6281 on 66 degrees of freedom
## Multiple R-squared: 0.9581, Adjusted R-squared: 0.953
## F-statistic: 188.7 on 8 and 66 DF, p-value: < 2.2e-16
$active is a list of which of the input variables are treated as active when fitting the GP to the residuals (essentially, in the mean fitting process we may find that some inputs are just noise, hence we don’t include these when fitting the covariance):
em3$active
## [1] "lrrA" "kMV" "lrrV" "alpha"
Here, all 4 of the inputs are considered active, but in general, as the dimensionality of the input space increases, it’s more common for this to be a subset of the inputs.
By default, this mean function is not allowed more terms than 0.1*number of training points. However, this won’t always work, and option maxdf allows flexibility in this specification (e.g., we may find we overfitted the data and want to limit the number of terms being added, or we may find that 10% is too restrictive and we need to add more).
The above emulated a single output, but this code could be used to emulate \(\ell\) outputs by looping over different outputs.
Alternatively, we could use dimension reduction to simplify this task. Let’s emulate Flow1. First, we want to construct a basis. This is easy to do (this function has more flexibility than is usually needed, e.g., it can perform weighted SVD for general matrices):
DataBasis <- MakeDataBasis(all_data[,1,])
summary(DataBasis)
## Length Class Mode
## tBasis 51200 -none- numeric
## CentredField 51200 -none- numeric
## EnsembleMean 512 -none- numeric
## Q 0 -none- NULL
## Lambda 0 -none- NULL
dim(DataBasis$tBasis)
## [1] 512 100
dim(DataBasis$CentredField)
## [1] 512 100
By default, this function subtracts the ensemble mean, and then calculates the SVD across the (centred) data. $Q and $Lambda are only stored if this function is used for weighted SVD, hence are not returned here (and indeed, not required in future calculations).
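In base R, the default (unweighted) construction is presumably equivalent to centring the data matrix and taking its SVD. A sketch of this assumed behaviour on random stand-in data:

```r
# Stand-in for all_data[,1,]: 512 timepoints x 100 runs (random, for illustration)
Y <- matrix(rnorm(512 * 100), 512, 100)
EnsembleMean <- rowMeans(Y)            # mean output at each timepoint
CentredField <- Y - EnsembleMean       # subtract the ensemble mean rowwise
tBasis <- svd(CentredField)$u          # orthonormal basis vectors as columns
```

Because the columns of tBasis are orthonormal, projection and reconstruction later on reduce to simple matrix products.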
Plotting the leading few vectors:
q <- 9
plot_basis <- data.frame(Time = rep(1:t, q),
                         Vector = rep(1:q, each = t),
                         Weight = c(DataBasis$tBasis[,1:q]))
ggplot(plot_basis, aes(Time, Weight)) +
  geom_line() +
  facet_wrap(vars(Vector))
Generally we truncate this basis for some small \(q\) (such that a large enough amount of variability has been explained and/or such that any later vectors are just noise, with no signal from the parameters).
q <- ExplainT(DataBasis, vtot = 0.95)
q
## [1] 4
Here, 4 vectors explain 95% of the variability in the data. In practice, we might want to try emulating the 5th, 6th, etc. to see if these are predictable, but let’s just use the first 4 for now.
Alternatively, plot cumulative variance explained:
vars <- lapply(1:n, function(k) VarExplained(DataBasis$tBasis[,1:k], DataBasis$CentredField))
ggplot(data.frame(q = 1:n, Proportion = unlist(vars)), aes(x = q, y = Proportion)) +
geom_line() +
ylim(0,1)
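Under the SVD construction, these proportions come straight from the squared singular values, so ExplainT is presumably doing something like the following (illustrated with made-up singular values):

```r
d <- c(10, 5, 3, 1, 0.5)           # hypothetical singular values
prop <- cumsum(d^2) / sum(d^2)     # cumulative proportion of variance explained
q <- which(prop >= 0.95)[1]        # smallest truncation explaining at least 95%
```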
Projecting the data onto the basis:
Coeffs <- Project(data = DataBasis$CentredField,
                  basis = DataBasis$tBasis[,1:q])
colnames(Coeffs)[1:q] <- paste0("C", 1:q)
summary(Coeffs)
## C1 C2 C3 C4
## Min. :-63.952 Min. :-42.022 Min. :-25.063 Min. :-27.300
## 1st Qu.:-28.476 1st Qu.:-20.689 1st Qu.:-10.353 1st Qu.: -5.812
## Median : -5.733 Median : 0.856 Median : -4.299 Median : -1.565
## Mean : 0.000 Mean : 0.000 Mean : 0.000 Mean : 0.000
## 3rd Qu.: 22.744 3rd Qu.: 19.789 3rd Qu.: 5.128 3rd Qu.: 5.334
## Max. :162.433 Max. : 50.687 Max. : 81.680 Max. : 27.256
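The zero means in this summary are a consequence of centring: for an orthonormal basis, Project is presumably just a matrix product, and projecting row-centred data gives mean-zero coefficients. A base-R sketch of this assumed behaviour:

```r
Yc <- matrix(rnorm(60 * 30), 60, 30)
Yc <- Yc - rowMeans(Yc)          # centre across runs, as MakeDataBasis does
B <- svd(Yc)$u[, 1:4]
coeffs <- crossprod(Yc, B)       # n x q: one row of coefficients per run (assumed shape)
colMeans(coeffs)                 # all (numerically) zero, as in the summary above
```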
tDataC <- data.frame(tData[,1:4], # getting the scaled version from before
                     Noise = runif(n, -1, 1),
                     Coeffs)
head(tDataC)
## kMV alpha lrrA lrrV Noise C1
## 1 -0.4105579 -0.8342397 0.2475622 -0.358438412 -0.94629091 -57.654972
## 2 0.9304734 -0.8644406 -0.2362974 0.390282117 0.63345668 -20.741136
## 3 0.2210836 0.3976897 -0.8912116 0.732106785 0.75267760 47.747129
## 4 -0.6875105 -0.9653981 -0.3671799 0.008994258 -0.20308374 -51.121585
## 5 0.2702968 0.2713426 0.1042343 -0.667411738 0.38438111 21.977666
## 6 -0.2430006 0.4731540 0.4515797 -0.257609746 0.02633979 -5.173414
## C2 C3 C4
## 1 -21.804896 9.724607 1.470627
## 2 -32.018533 -9.987071 10.134155
## 3 1.772807 -19.527830 -6.458404
## 4 -7.456377 15.738800 -13.113496
## 5 -18.626407 -6.811848 -6.335217
## 6 21.054044 -2.062772 -4.182515
Now we can build emulators for these \(q\) coefficients. We can do this simultaneously for all \(q\) (as long as we’re happy using the same assumptions for each). To make things consistent across each emulator, I’m going to define training/validation sets by hand, and then fit to the full training set by setting training_prop = 1 (rather than allowing the function to do this split for me):
set.seed(321)
inds <- sample(1:n, n)
train_inds <- inds[1:75]
val_inds <- inds[-c(1:75)]
train_data <- tDataC[train_inds,]
val_data <- tDataC[val_inds,]
em_coeffs <- BasisEmulators(train_data, q, mean_fn = 'step', maxdf = 5, training_prop = 1)
The output here is now a list of \(q\) emulator objects, e.g.
summary(em_coeffs[[1]])
## Length Class Mode
## em 1 rgasp S4
## lm 11 -none- list
## active 4 -none- character
## type 1 -none- character
## mean_fn 1 -none- character
## train_data 5 data.frame list
## validation_data 0 -none- NULL
summary(em_coeffs[[2]])
## Length Class Mode
## em 1 rgasp S4
## lm 11 -none- list
## active 4 -none- character
## type 1 -none- character
## mean_fn 1 -none- character
## train_data 5 data.frame list
## validation_data 0 -none- NULL
Now, in ValidateGasp, we need to provide the validation set (it’s no longer stored internally in the emulator object):
par(mfrow=c(2,2), mar=c(4,2,2,2))
ValidateGasp(em_coeffs[[1]], val_data)
ValidateGasp(em_coeffs[[2]], val_data)
ValidateGasp(em_coeffs[[3]], val_data)
ValidateGasp(em_coeffs[[4]], val_data)
par(mfrow=c(2,2), mar=c(4,2,2,2))
LeaveOneOut(em_coeffs[[1]]);LeaveOneOut(em_coeffs[[2]]);LeaveOneOut(em_coeffs[[3]]);LeaveOneOut(em_coeffs[[4]])
The validation plots above are performing prediction across the test data. In general, we can predict for any sets of inputs, for either a 1D emulator, or for a set of basis emulators. Doing so across a space-filling design in parameter space (here, we are still working in [-1,1]):
ns <- 1000 # usually want more, but set low for speed
library(lhs) # provides randomLHS
BigDesign <- 2*as.data.frame(randomLHS(ns, 4)) - 1
colnames(BigDesign) <- colnames(design)[1:4]
Preds_1D <- PredictGasp(BigDesign, em3)
Preds_basis <- BasisPredGasp(BigDesign, em_coeffs)
These store slightly different things - in the 1D case, we get mean/sd/lower95/upper95 (the same as what predict.rgasp returns):
summary(Preds_1D)
## Length Class Mode
## mean 1000 -none- numeric
## lower95 1000 -none- numeric
## upper95 1000 -none- numeric
## sd 1000 -none- numeric
In the basis case, only $Expectation and $Variance are stored (this is all we need for history matching, and prediction intervals can be derived from these, so we don’t want to store them when we have a lot of emulators):
summary(Preds_basis)
## Length Class Mode
## Expectation 4000 -none- numeric
## Variance 4000 -none- numeric
Both of these objects are \(ns \times q\) dimensional, where each row corresponds to an input we’re predicting at, and each column corresponds to a basis vector:
dim(Preds_basis$Expectation)
## [1] 1000 4
dim(Preds_basis$Variance)
## [1] 1000 4
From basis coefficients (whether given by projection, or predicted by an emulator), we can reconstruct a prediction of the original field.
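For an orthonormal basis, reconstruction is presumably just the product of the basis with the coefficients, and any field lying in the span of the basis is recovered exactly. A sketch of this assumed Recon/Project relationship:

```r
B <- qr.Q(qr(matrix(rnorm(512 * 4), 512, 4)))  # an orthonormal 512 x 4 basis
c0 <- c(1, -2, 3, 0.5)
field <- B %*% c0            # 'Recon': coefficients -> field (assumed behaviour)
back <- crossprod(B, field)  # 'Project': field -> coefficients, recovering c0
```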
Let’s consider the runs from the validation set. First produce predictions for them:
Preds_val <- BasisPredGasp(val_data, em_coeffs)
Let’s just take the first run, and compare the mean/variance to the truth (in coefficient/projection space):
data.frame(Truth = as.numeric(val_data[1,-c(1:5)]),
           Mean = Preds_val$Expectation[1,],
           Var = Preds_val$Variance[1,])
## Truth Mean Var
## 1 -22.05506 -20.082626 1.2726158
## 2 31.87312 32.698060 2.5789602
## 3 12.18553 12.623414 0.9006701
## 4 3.29881 3.858779 2.6456150
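A quick consistency check is to look at the standardised errors; using the numbers from the table above, all four lie comfortably within ±2:

```r
truth <- c(-22.05506, 31.87312, 12.18553, 3.29881)
m     <- c(-20.082626, 32.698060, 12.623414, 3.858779)
v     <- c(1.2726158, 2.5789602, 0.9006701, 2.6456150)
z <- (truth - m) / sqrt(v)   # standardised prediction errors
```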
Plot the true run vs emulator reconstruction of it (still with the ensemble mean removed, as this will dominate):
plot_data <- data.frame(Time = 1:t,
                        Truth = DataBasis$CentredField[,val_inds[1]],
                        Recon = Recon(Preds_val$Expectation[1,], DataBasis$tBasis[,1:q]))
plot_data2 <- melt(plot_data, id.vars = c('Time'))
ggplot(plot_data2, aes(Time, value, col = variable)) + geom_line()
Looks quite accurate (remember this run was not used for fitting the emulator).
A mean prediction by itself is not that informative, so we also sample from the emulators:
em_samp <- matrix(0, t, 100)
for (s in 1:100){
  samp <- rnorm(q,
                mean = Preds_val$Expectation[1,],
                sd = sqrt(Preds_val$Variance[1,]))
  rec <- Recon(samp, DataBasis$tBasis[,1:q])
  em_samp[,s] <- rec
}
ggplot(plot_data2, aes(Time, value, col = variable)) +
geom_line(data = data.frame(Time = 1:t, value = c(em_samp), s = rep(1:100, each = t)), aes(Time, value, linetype = as.factor(s)), col = 'grey', alpha = 0.6) +
geom_line(size = 1.25) +
scale_linetype_manual(values = rep(1,100), guide = 'none')
Really, we should also add the uncertainty due to the discarded basis vectors - by truncating the basis after \(q\) vectors, we know that we’re not perfectly reconstructing the true simulations via the emulators, and need to acknowledge this additional variability. It is likely small relative to the variability we’ve explained (less than 5%) but we should still account for it.
There’s a function that will calculate this variance matrix, and we can sample from it:
extra_var <- DiscardedBasisVariance(DataBasis, q)
library(mvtnorm) # provides rmvnorm
extra_var_samples <- rmvnorm(100, rep(0, t), extra_var)
dim(extra_var_samples)
## [1] 100 512
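This variance is presumably the empirical covariance of what remains after projecting the centred data onto the first \(q\) vectors (a base-R sketch of the assumed calculation, on random stand-in data):

```r
Yc <- matrix(rnorm(50 * 20), 50, 20)
Yc <- Yc - rowMeans(Yc)
B <- svd(Yc)$u[, 1:4]
resid <- Yc - B %*% crossprod(B, Yc)             # part not captured by the first 4 vectors
extra_var_sketch <- tcrossprod(resid) / (ncol(Yc) - 1)  # covariance of the residual
```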
plot_samples <- data.frame(Time = 1:t,
                           epsilon = c(t(extra_var_samples)),
                           s = rep(1:100, each = t))
ggplot(plot_samples, aes(Time, epsilon, col = as.factor(s))) +
geom_line(alpha = 0.6) +
theme(legend.position = 'none')
The above shows 100 samples from the discarded vectors, with clear correlated structure (some of the later basis vectors are likely quite noisy; however, these have lower variances/eigenvalues, so are drowned out by the earlier, more structured, discarded vectors).
Adding this variability onto our emulator samples:
ggplot(plot_data2, aes(Time, value, col = variable)) +
geom_line(data = data.frame(Time = 1:t, value = c(em_samp + t(extra_var_samples)), s = rep(1:100, each = t)), aes(Time, value, linetype = as.factor(s)), col = 'grey', alpha = 0.6) +
geom_line(size = 1.25) +
scale_linetype_manual(values = rep(1,100), guide = 'none')
Before, we were missing the data in places, because the truncated basis did not have the ability to perfectly reconstruct the truth. Now, the truth lies within our uncertainty.
Just plotting 95% prediction intervals for clarity:
plot_data <- data.frame(Time = 1:t,
Truth = DataBasis$CentredField[,val_inds[1]],
Recon = rep(Recon(Preds_val$Expectation[1,], DataBasis$tBasis[,1:q]), 2),
Lower = c(apply(em_samp, 1, quantile, probs = 0.025), apply(em_samp + t(extra_var_samples), 1, quantile, probs = 0.025)),
Upper = c(apply(em_samp, 1, quantile, probs = 0.975), apply(em_samp + t(extra_var_samples), 1, quantile, probs = 0.975)),
Type = rep(c('EmVar', 'FullVar'), each = t))
plot_data2 <- melt(plot_data, id.vars = c('Time', 'Type'))
pal <- scales::hue_pal()(2)
ggplot(plot_data2, aes(Time, value, col = variable, linetype = variable)) +
geom_line(size = 0.8) +
facet_wrap(vars(Type)) +
scale_linetype_manual(values = c(1,1,2,2)) +
scale_colour_manual(values = pal[c(1,2,2,2)]) +
theme(legend.position = 'none')
Or doing this for 4 different inputs (immediately adding in the variance from the discarded vectors):
em_samp <- array(0, dim = c(t, 100, 4))
plot_data <- NULL
for (i in 2:5){
  plot_data <- rbind(plot_data,
                     data.frame(Time = 1:t,
                                Truth = DataBasis$CentredField[,val_inds[i]],
                                Recon = Recon(Preds_val$Expectation[i,], DataBasis$tBasis[,1:q]),
                                Run = i))
  for (s in 1:100){
    samp <- rnorm(q,
                  mean = Preds_val$Expectation[i,],
                  sd = sqrt(Preds_val$Variance[i,]))
    rec <- Recon(samp, DataBasis$tBasis[,1:q])
    em_samp[,s,i-1] <- rec
  }
}
plot_data2 <- melt(plot_data, id.vars = c('Time', 'Run'))
extra_var_samples <- t(rmvnorm(400, rep(0, t), extra_var))
dim(extra_var_samples) <- c(t, 100, 4)
plot_data_samp <- data.frame(Time = 1:t,
                             value = c(em_samp + extra_var_samples),
                             s = rep(1:100, each = t),
                             Run = rep(2:5, each = t*100))
ggplot(plot_data2, aes(Time, value, col = variable)) +
facet_wrap(vars(Run)) +
geom_line(data = plot_data_samp, aes(Time, value, linetype = as.factor(s)), col = 'grey', alpha = 0.6) +
geom_line(size = 0.8) +
scale_linetype_manual(values = rep(1,100), guide = 'none')
We may want to see what’s happening across the output space as we systematically change the values of the inputs. Here, the two dominant parameters are kMV and alpha, so let’s vary these.
Before, we predicted over a large Latin hypercube. Now let’s systematically vary these two parameters, fixing the others at the centre of their range (zero):
pred_seq <- seq(from = -1, to = 1, by = 0.04)
pred_grid <- expand.grid(kMV = pred_seq, alpha = pred_seq)
NewDesign <- data.frame(pred_grid, lrrA = 0, lrrV = 0)
PredsNewBasis <- BasisPredGasp(NewDesign, em_coeffs)
Plotting the expectation of C1:
library(viridis) # provides scale_colour_viridis
ggplot(data.frame(NewDesign, C1 = PredsNewBasis$Expectation[,1]),
       aes(x = kMV, y = alpha, col = C1)) +
  geom_point(size = 3, shape = 15) +
  scale_colour_viridis(limits = c(-85,85))
Or C2:
ggplot(data.frame(NewDesign, C2 = PredsNewBasis$Expectation[,2]),
       aes(x = kMV, y = alpha, col = C2)) +
  geom_point(size = 3, shape = 15) +
  scale_colour_viridis()
This is a 2D surface in the 4D input space, with lrrA and lrrV both fixed to zero. Setting them both to 1 instead demonstrates different behaviour in the output:
pred_seq <- seq(from = -1, to = 1, by = 0.04)
pred_grid <- expand.grid(kMV = pred_seq, alpha = pred_seq)
NewDesign <- data.frame(pred_grid, lrrA = 1, lrrV = 1)
PredsNewBasis <- BasisPredGasp(NewDesign, em_coeffs)
ggplot(data.frame(NewDesign, C1 = PredsNewBasis$Expectation[,1]),
       aes(x = kMV, y = alpha, col = C1)) +
  geom_point(size = 3, shape = 15) +
  scale_colour_viridis(limits = c(-85,85))
Depending on what we’re interested in (relationship on this 2D surface for specific values of other parameters, vs general behaviour integrated across all other uncertain inputs), instead of fixing the other inputs we could sample. For example, for each specific pair of \((kMV, alpha)\) values (each point on the plot), we could sample a Latin hypercube across the remaining parameters, average the emulator prediction across these points, and plot this instead.
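The averaging idea can be sketched with a stand-in function in place of the emulator (f here is made up purely for illustration; in practice we would call BasisPredGasp on the sampled design):

```r
f <- function(kMV, alpha, lrrA, lrrV) kMV + alpha * lrrA - lrrV^2  # stand-in, not the emulator

# Average the prediction over m random settings of the remaining inputs
avg_pred <- function(kMV, alpha, m = 100) {
  others <- matrix(runif(2 * m, -1, 1), m, 2)   # samples of (lrrA, lrrV)
  mean(f(kMV, alpha, others[, 1], others[, 2]))
}

# evaluate avg_pred at each (kMV, alpha) grid point instead of fixing lrrA, lrrV
```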